
    Counting, generating and sampling tree alignments

    Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees S and T. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door to a probabilistic analysis of the space of suboptimal RNA secondary structure alignments. Comment: ALCOB - 3rd International Conference on Algorithms for Computational Biology - 2016, Jun 2016, Trujillo, Spain. 201
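
    The idea of sampling alignments under a Gibbs-Boltzmann distribution by dynamic programming can be illustrated on a much simpler object than ordered trees. The sketch below is not the algorithm of the paper: it swaps in plain pairwise sequence alignment, and the scoring function `toy_score`, the gap score, and the temperature `kT` are all assumptions made for illustration. It fills a partition-function table and then walks back through it, picking each step with probability proportional to its Boltzmann weight. Note that this naive grammar counts alignments that differ only in the order of adjacent insertions and deletions as distinct, which is the kind of ambiguity the abstract warns about.

```python
import math
import random


def toy_score(x, y):
    """Hypothetical substitution score: +1 for a match, -1 for a mismatch."""
    return 1.0 if x == y else -1.0


def partition_table(a, b, score, gap, kT=1.0):
    """Z[i][j] = sum of exp(total_score / kT) over all alignments of a[:i] and b[:j]."""
    n, m = len(a), len(b)
    w_gap = math.exp(gap / kT)
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    Z[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:  # align a[i-1] with b[j-1]
                total += Z[i - 1][j - 1] * math.exp(score(a[i - 1], b[j - 1]) / kT)
            if i > 0:  # a[i-1] against a gap
                total += Z[i - 1][j] * w_gap
            if j > 0:  # b[j-1] against a gap
                total += Z[i][j - 1] * w_gap
            Z[i][j] = total
    return Z


def sample_alignment(a, b, score, gap, kT=1.0, rng=random):
    """Stochastic traceback: draw one alignment with probability weight / Z[n][m]."""
    Z = partition_table(a, b, score, gap, kT)
    w_gap = math.exp(gap / kT)
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        moves = []
        if i > 0 and j > 0:
            moves.append(("M", Z[i - 1][j - 1] * math.exp(score(a[i - 1], b[j - 1]) / kT)))
        if i > 0:
            moves.append(("D", Z[i - 1][j] * w_gap))
        if j > 0:
            moves.append(("I", Z[i][j - 1] * w_gap))
        r = rng.random() * Z[i][j]
        for move, weight in moves:
            r -= weight
            if r <= 0:
                break
        # If rounding prevented a break, `move` is simply the last candidate.
        if move == "M":
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif move == "D":
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))
```

    Calling `sample_alignment("GAUC", "GUC", toy_score, -1.0)` returns two gapped strings of equal length. Lowering `kT` concentrates the samples on high-scoring alignments, while raising it moves the distribution toward uniform over all alignment paths.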

    Respiratory risks in broiler production workers

    Brazilian rural workers face many health risks, and animal production is just one source of them. Inhalation of organic dust, which carries many microorganisms, generally leads to allergic respiratory reactions in some individuals, asthma-like syndrome, and mucous membrane inflammation syndrome, a complex of nasal, eye, and throat complaints. Furthermore, workers may develop farmer's hypersensitivity pneumonia, a respiratory health risk that builds over the years. The objective of this study was to evaluate the potential pulmonary health risks in poultry production workers in the region of Curitiba, PR, Brazil. A pre-elaborated questionnaire with 40 questions was administered in interviews to 37 broiler production workers, who also underwent a pulmonary function test. Restrictive function with lower FEV1 (the maximum respiratory potential, i.e. the forced expiratory volume in the first second of exhalation) and FVC (forced vital capacity) was found in 24.32% of the workers, and severe obstruction in 2.70%. Other symptoms were found in 67.57% of the workers as well. The results showed that those who had worked for more than 4 years, in more than one poultry house, and for more than 5 hours per day presented higher pulmonary health risks. It is concluded that activities within broiler houses may induce allergic respiratory reactions in workers. The use of IPE (individual protection equipment), together with special attention to the air quality inside the housing, may be advised as a preventive measure. Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP

    Gene translocation links insects and crustaceans

    Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62560/1/392667a0.pd

    PicXAA-R: Efficient structural alignment of multiple RNA sequences using a greedy approach

    Background: Accurate and efficient structural alignment of non-coding RNAs (ncRNAs) has attracted increasing attention as recent studies have unveiled the significance of ncRNAs in living organisms. Because Sankoff-style structural alignment algorithms do not scale efficiently to multiple sequences, progressive schemes are mostly used to reduce the complexity. However, this approach tends to propagate early-stage errors throughout the entire process, thereby degrading the quality of the final alignment. For multiple protein sequence alignment, we have recently proposed PicXAA, which constructs an accurate alignment in a non-progressive fashion. Results: Here, we propose PicXAA-R as an extension to PicXAA for greedy structural alignment of ncRNAs. PicXAA-R efficiently captures both folding information within each sequence and local similarities between sequences. It uses a set of probabilistic consistency transformations to improve the posterior base-pairing and base alignment probabilities using the information of all sequences in the alignment. Using a graph-based scheme, we greedily build up the structural alignment from sequence regions with high base-pairing and base alignment probabilities. Conclusions: Several experiments on datasets with different characteristics confirm that PicXAA-R is one of the fastest algorithms for structural alignment of multiple RNAs and that it consistently yields accurate alignment results, especially for datasets with locally similar sequences. PicXAA-R source code is freely available at: http://www.ece.tamu.edu/~bjyoon/picxaa/.

    Optimizing substitution matrix choice and gap parameters for sequence alignment

    Background: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. Results: POP is compared to a recent method due to Kim and Kececioglu and found to achieve from 0.2% to 1.3% higher accuracies on pair-wise benchmarks extracted from BALIBASE. The VTML matrix series is shown to be the most accurate on several global pair-wise alignment benchmarks, with VTML200 giving best or close to the best performance in all tests. BLOSUM matrices are found to be slightly inferior, even with the marginal improvements in the bug-fixed RBLOSUM series. The PAM series is significantly worse, giving accuracies typically 2% less than VTML. Integer rounding is found to cause slight degradations in accuracy. No evidence is found that selecting a matrix based on sequence divergence improves accuracy, suggesting that the use of this heuristic in CLUSTALW may be ineffective. Using VTML200 is found to improve the accuracy of CLUSTALW by 8% on BALIBASE and 5% on PREFAB. Conclusion: The hypothesis that more accurate alignments of distantly related sequences may be achieved using low-identity matrices is shown to be false for commonly used matrix types. Source code and test data are freely available from the author's web site at http://www.drive5.com/pop.
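
    To make concrete what a substitution matrix and a gap penalty control in a pair-wise global alignment, here is a minimal Needleman-Wunsch scoring sketch. It is not POP and uses none of the matrices named above: the match and mismatch scores in `toy_substitution` and the single linear `gap` penalty are assumptions chosen for illustration, whereas real aligners typically use a full residue matrix and separate gap-open and gap-extend penalties.

```python
def toy_substitution(x, y):
    """Hypothetical stand-in for a substitution matrix: +2 match, -1 mismatch."""
    return 2 if x == y else -1


def global_alignment_score(a, b, sub=toy_substitution, gap=-2):
    """Optimal global alignment score of a and b under a linear gap penalty."""
    n, m = len(a), len(b)
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0] * m
        for j in range(1, m + 1):
            cur[j] = max(
                prev[j - 1] + sub(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                          # gap in b
                cur[j - 1] + gap,                       # gap in a
            )
        prev = cur
    return prev[m]
```

    Changing `gap` or swapping in a different `sub` function changes which alignment is optimal, which is why choosing these parameters well matters for accuracy. For example, `global_alignment_score("AC", "A")` is 0 with the default penalty of -2 and drops to -3 with `gap=-5`.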

    Novel associations for hypothyroidism include known autoimmune risk loci

    Hypothyroidism is the most common thyroid disorder, affecting about 5% of the general population. Here we present the first large genome-wide association study of hypothyroidism, in 2,564 cases and 24,448 controls from the customer base of 23andMe, Inc., a personal genetics company. We identify four genome-wide significant associations, two of which are well known to be involved with a large spectrum of autoimmune diseases: rs6679677 near _PTPN22_ and rs3184504 in _SH2B3_ (p-values 3.5e-13 and 3.0e-11, respectively). We also report associations with rs4915077 near _VAV3_ (p-value 8.3e-11), another gene involved in immune function, and rs965513 near _FOXE1_ (p-value 3.1e-14). Of these, the association with _PTPN22_ confirms a recent small candidate gene study, and _FOXE1_ was previously known to be associated with thyroid-stimulating hormone (TSH) levels. Although _SH2B3_ has been previously linked with a number of autoimmune diseases, this is the first report of its association with thyroid disease. The _VAV3_ association is novel. These results suggest heterogeneity in the genetic etiology of hypothyroidism, implicating genes involved in both autoimmune disorders and thyroid function. Using a genetic risk profile score based on the top association from each of the four genome-wide significant regions in our study, the relative risk between the highest and lowest deciles of genetic risk is 2.1

    A user-friendly web portal for T-Coffee on supercomputers

    Background: Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of large-scale sequence alignments. It can be run on distributed memory clusters, allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results: In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed with Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications, and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. Conclusions: The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution.

    Rare coding SNP in DZIP1 gene associated with late-onset sporadic Parkinson's disease

    We present the first application of the hypothesis-rich mathematical theory to genome-wide association data. The Hamza et al. late-onset sporadic Parkinson's disease genome-wide association study dataset was analyzed. We found a rare, coding, non-synonymous SNP variant in the gene DZIP1 that confers increased susceptibility to Parkinson's disease. The association of DZIP1 with Parkinson's disease is consistent with a Parkinson's disease stem-cell ageing theory. Comment: 14 page

    Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment

    Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to infer related residues among biological sequences. Thus alignment accuracy is crucial to a vast range of analyses, often in ways difficult to assess in those analyses. To compare the performance of different aligners and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued. Here we present an overview of the main strategies (based on simulation, consistency, protein structure, and phylogeny) and discuss their different advantages and associated risks. We outline a set of desirable characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that there is currently no universally applicable means of benchmarking MSA, and that developers and users of alignment tools should base their choice of benchmark on the context of application, with a keen awareness of the assumptions underlying each benchmarking strategy. Comment: Revie

    Release of Lungworm Larvae from Snails in the Environment: Potential for Alternative Transmission Pathways

    Background: Gastropod-borne parasites may cause debilitating clinical conditions in animals and humans following the consumption of infected intermediate or paratenic hosts. However, the ingestion of fresh vegetables contaminated by snail mucus and/or water has also been proposed as a source of the infection for some zoonotic metastrongyloids (e.g., Angiostrongylus cantonensis). In the meantime, the feline lungworms Aelurostrongylus abstrusus and Troglostrongylus brevior are increasingly spreading among cat populations, along with their gastropod intermediate hosts. The aim of this study was to assess the potential of alternative transmission pathways for A. abstrusus and T. brevior L3 via the mucus of infected Helix aspersa snails and the water where gastropods died. In addition, the histological examination of snail specimens provided information on the larval localization and inflammatory reactions in the intermediate host. Methodology/Principal Findings: Twenty-four specimens of H. aspersa received ~500 L1 of A. abstrusus and T. brevior, and were assigned to six study groups. Snails were subjected to different mechanical and chemical stimuli throughout 20 days in order to elicit the production of mucus. At the end of the study, gastropods were submerged in tap water and the sediment was observed for lungworm larvae for three consecutive days. Finally, snails were artificially digested and recovered larvae were counted and morphologically and molecularly identified. The anatomical localization of A. abstrusus and T. brevior larvae within snail tissues was investigated by histology. L3 were detected in the snail mucus (i.e., 37 A. abstrusus and 19 T. brevior) and in the sediment of submerged specimens (172 A. abstrusus and 39 T. brevior). Following the artificial digestion of H. aspersa snails, a mean number of 127.8 A. abstrusus and 60.3 T. brevior larvae were recovered. The number of snail sections positive for A. abstrusus was higher than that for T. brevior. Conclusions: Results of this study indicate that A. abstrusus and T. brevior infective L3 are shed in the mucus of H. aspersa or in water where infected gastropods had died submerged. Both elimination pathways may represent alternative route(s) of environmental contamination and source of the infection for these nematodes under field conditions and may significantly affect the epidemiology of feline lungworms. Considering that snails may act as intermediate hosts for other metastrongyloid species, the environmental contamination by mucus-released larvae is discussed in a broader context